Abstract. We present a new way to detect basepair mismatches in DNA, which lead
to various epigenetic disorders, by the method of nanopore sequencing. Based on a
tight-binding formulation of a graphene-based nanopore device and using the Green's
function approach, we study the changes in the electronic transport properties of the
device as we translocate a double-stranded DNA through a nanopore embedded in a
zigzag graphene nanoribbon. In the present work we successfully detect not only the
usual AT and GC pairs, but also a set of possible mismatches in the complementary
base-pairing.
1. Introduction:

Basepair mismatches in DNA are one of the major reasons behind several mutagenic
disorders which may lead to genomic instabilities, the development of cancer [1]
and other degenerative diseases. Mismatches in DNA bases occur mainly due to
misincorporation of nitrogen bases during DNA replication, oxidative or chemical
damage, and ionizing radiation. In spite of dramatic advancements in medical science,
many crucial issues, such as how DNA detects and repairs damage and individual
mismatches, or which observable physical parameter detects a basepair mismatch most
accurately, still remain unclear. Apart from the traditional fluorescence-based
sequencing technique [2], several other methods have also been applied to detect
mismatches. Some examples are magnetic signatures [3], longitudinal electronic
transport [4, 5], thermodynamic properties of basepair mismatches [6] and the study of
stretched DNA using AFM [7], but no conclusive results have appeared. With the advent
of nanopore-based sequencing [8, 9, 10, 11, 12, 13], however, a new pathway has opened
for marker-free gene testing. In the early days of nanopore sequencing people mostly
used biological nanopores (α-haemolysin) to detect the changes in ionic current as a
single-stranded DNA (ss-DNA) passes through the pore [8, 9, 10, 11]. With time, the
choice of nanopore materials has also evolved from biological to solid-state nanopores.
The latter overcome many drawbacks of biological nanopores, e.g., poor mechanical
strength [14] and problems of integration with on-chip electronics [15]. Solid-state
nanopores also provide other advantages, such as multiplex detection [16] and
detectable physical parameters other than ionic current [8, 17, 18, 13, 19, 20, 21, 22].
Despite these advantages, they have one important shortcoming: the average thickness
of the synthetic membranes used for molecular detection is of the order of 10 nm, which
spans several nucleobases at a time (the distance between two consecutive nitrogen
bases in a DNA chain is 0.34 nm), jeopardizing single-molecule, base-specific detection.
Graphene, a single layer of graphite [23], provides a solution to this problem. As the
single-layer thickness is of the order of the distance between two consecutive bases in
DNA, and given its various advantageous properties [24], it is an ideal candidate for
sequencing applications (recently other 2-D materials, such as silicene, have also been
studied for the purpose of DNA sequencing [25]). Graphene also provides several routes
to sequential detection, e.g., nanoribbon conductance [26, 27, 28] and transverse
tunnelling [29, 30]. Readers can consult review articles [31, 32, 33, 34] for a detailed
description of nanopore-based sequencing techniques.

In this work we present a theoretical study of the detection of basepair mismatches in
DNA using the method of nanopore sequencing. Though several studies on ss-DNA
sequencing already exist in the literature [29, 28, 26, 35, 36], there is no such report
on double-stranded DNA (ds-DNA). We use a graphene-nanopore-based sequencing device
created on a single-layer zigzag graphene nanoribbon (zgnr), following Ref. [28]. Using
the Landauer-Büttiker formalism we study the changes in the electronic transport
properties of the device as a ds-DNA (which also contains basepair mismatches)
translocates through the nanopore. Distinct features are observed in the transmission
probability, and to some extent in the I-V response, for the canonical Watson-Crick
pairs and for four different types of possible mismatches. The local density of states
(LDOS) also provides useful insight. Our results open a new pathway for reliable
detection of basepair mismatches in DNA, an important diagnostic for genetic disorders.

2. Theoretical Formulation: 

To study numerically the sequential determination of basepair mismatches in DNA we
use a zigzag graphene nanoribbon with a pore created at its centre. We preserve the
two-sublattice symmetry of graphene while creating the nanopore [37]. The whole zgnr
system can be represented by an effective Hamiltonian (see Fig. 1)

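$$H_{\mathrm{zgnr}} = \sum_{i=1}^{N} \epsilon\, c_i^{\dagger} c_i + \sum_{\langle i,j \rangle} t \left( c_i^{\dagger} c_j + \mathrm{H.c.} \right) \qquad (1)$$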



Figure 1. (Color online). Schematic view of the ZGNR nanopore device with a ds-DNA
passing through the nanopore. The current flows laterally through the zgnr, i.e., in
the transverse direction.

where $\epsilon$ is the site energy of each carbon atom in the ZGNR and $t$ is the
nearest-neighbour hopping amplitude. $c_i^{\dagger}$ and $c_i$ create and annihilate an
electron at the ith site, respectively. For the calculation of transport properties we
also use semi-infinite zgnrs as electrodes [28]. Thus the total Hamiltonian of the
system can be written as $H_{\mathrm{tot}} = H_{\mathrm{zgnr}} + H_{\mathrm{leads}} + H_{\mathrm{tun}}$,
where $H_{\mathrm{tun}}$ represents the tunneling Hamiltonian between the nanopore
device and the electrodes. In our calculations we scale energy in terms of $t$, i.e.,
we set $t$ = 1.0 eV. The Hamiltonian of a ds-DNA can be expressed as

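$$H_{\mathrm{DNA}} = \sum_{i=1}^{N} \sum_{j=\mathrm{I,II}} \left( \epsilon_{ij}\, c_{ij}^{\dagger} c_{ij} + t_{ij}\, c_{ij}^{\dagger} c_{i+1,j} + \mathrm{H.c.} \right) + \sum_{i=1}^{N} v \left( c_{i,\mathrm{I}}^{\dagger} c_{i,\mathrm{II}} + \mathrm{H.c.} \right) \qquad (2)$$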
where $c_{ij}^{\dagger}$ and $c_{ij}$ are the electron creation and annihilation
operators at the ith nucleotide of the jth strand, $t_{ij}$ is the nearest-neighbour
hopping amplitude between nucleotides along the jth strand, $\epsilon_{ij}$ is the
on-site energy of the nucleotides, and $v$ is the interstrand hopping between the
nucleobases.
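
The ladder Hamiltonian of Eq. (2) can be assembled as a small matrix. The following is
a minimal Python sketch, not the authors' code: the function name
`dna_ladder_hamiltonian`, the site ordering (strand I first, then strand II) and the
use of NumPy are assumptions made for illustration. The shifted on-site energies and
the hopping values are those quoted in the Results section.

```python
import numpy as np

# Shifted on-site energies (eV) and hopping amplitudes (eV) quoted in the Results section.
SITE_ENERGY = {"G": -0.82, "A": -0.37, "C": 0.72, "T": 0.47}
T_SAME = 0.35    # intrastrand hopping between identical neighbouring bases
T_DIFF = 0.17    # intrastrand hopping between different neighbouring bases
V_INTER = 0.035  # interstrand hopping within a basepair


def dna_ladder_hamiltonian(strand_1, strand_2):
    """Return the 2N x 2N tight-binding matrix of the two-strand ladder model.

    Sites 0..N-1 belong to strand I and sites N..2N-1 to strand II
    (an ordering chosen here for illustration only).
    """
    n = len(strand_1)
    if len(strand_2) != n:
        raise ValueError("both strands must have the same length")
    h = np.zeros((2 * n, 2 * n))
    for j, strand in enumerate((strand_1, strand_2)):
        for i, base in enumerate(strand):
            k = j * n + i
            h[k, k] = SITE_ENERGY[base]           # on-site term
            if i + 1 < n:                         # hopping along the strand
                hop = T_SAME if strand[i + 1] == base else T_DIFF
                h[k, k + 1] = h[k + 1, k] = hop
    for i in range(n):                            # hopping across each basepair
        h[i, n + i] = h[n + i, i] = V_INTER
    return h


# Three basepairs G-C, A-T and a T-C mismatch.
h_dna = dna_ladder_hamiltonian("GAT", "CTC")
```

A mismatch enters only through the bases listed in the two strings, so the same
function covers canonical and mismatched pairs.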

Green's function formalism is used for both the LDOS and the transport calculations.
The transmission probability of an electron with energy $E$ is given by
$T(E) = \mathrm{Tr}[\Gamma_L G^r \Gamma_R G^a]$ [38], where $G^a = [G^r]^{\dagger}$ and
$\Gamma_{L(R)} = i[\Sigma^r_{L(R)} - \Sigma^a_{L(R)}]$.
$G^r = [E - H_{\mathrm{zgnr}} - \Sigma^r_L - \Sigma^r_R + i\eta]^{-1}$ is the
single-particle retarded Green's function for the entire system at energy $E$, where
$\Sigma^{r(a)}_{L(R)} = H^{\dagger}_{\mathrm{tun}} G^{r(a)}_{L(R)} H_{\mathrm{tun}}$
represents the retarded (advanced) self-energies of the left (right) zgnr electrodes,
which are calculated with the recursive Green's function technique [39, 40].
$G^{r(a)}_{L(R)}$ is the retarded (advanced) Green's function of the left (right) lead.
At absolute zero temperature, using the Landauer formula, the current through the
nanopore device for an applied voltage $V$ is given by
$I(V) = \frac{2e}{h} \int_{E_F - eV/2}^{E_F + eV/2} T(E)\, dE$, where $E_F$ is the
Fermi energy. Here we assume that there is no charge accumulation within the system.
The LDOS profiles of the basepairs trapped inside the nanopore are given by
$\rho(E, i) = -\frac{1}{\pi} \mathrm{Im}[G_{ii}(E)]$, where
$G(E) = (E - H + i\eta)^{-1}$ is the Green's function for the zgnr system including
the basepairs with electron energy $E$ as $\eta \to 0^{+}$, $H$ is the Hamiltonian of
the zgnr-nanopore, and Im denotes the imaginary part of $G_{ii}(E)$. $G_{ii}(E)$ is the
diagonal matrix element $\langle i|G(E)|i \rangle$ of the Green's function,
$|i\rangle$ being the Wannier state associated with the trapped nucleotide.
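
The transmission formula and the zero-temperature Landauer current can be illustrated
on a toy model. The sketch below is not the zgnr device of this paper: it assumes a
one-dimensional chain coupled to two semi-infinite one-dimensional leads, for which
the lead self-energy is known in closed form, so the recursive Green's function step is
replaced by an analytic expression. All function names, the broadening `eta` and the
energy grid are illustrative choices; energies are in eV, with the lead hopping set to
1 eV as in the text.

```python
import numpy as np

E_CHARGE = 1.602176634e-19  # elementary charge (C)
PLANCK = 6.62607015e-34     # Planck constant (J s)


def lead_self_energy(energy, t_lead=1.0, t_couple=1.0, eta=1e-9):
    """Retarded self-energy of a semi-infinite 1D chain (analytic surface Green's function)."""
    z = energy + 1j * eta
    root = np.sqrt(z * z - 4.0 * t_lead**2)
    g = (z - root) / (2.0 * t_lead**2)
    if g.imag > 0:  # keep the retarded branch, which has Im g < 0
        g = (z + root) / (2.0 * t_lead**2)
    return t_couple**2 * g


def transmission(energy, h_device, eta=1e-9):
    """T(E) = Tr[Gamma_L G^r Gamma_R G^a] with leads attached to the first and last site."""
    n = h_device.shape[0]
    sigma_l = np.zeros((n, n), dtype=complex)
    sigma_r = np.zeros((n, n), dtype=complex)
    sigma_l[0, 0] = lead_self_energy(energy, eta=eta)
    sigma_r[-1, -1] = lead_self_energy(energy, eta=eta)
    gamma_l = 1j * (sigma_l - sigma_l.conj().T)
    gamma_r = 1j * (sigma_r - sigma_r.conj().T)
    g_r = np.linalg.inv((energy + 1j * eta) * np.eye(n) - h_device - sigma_l - sigma_r)
    g_a = g_r.conj().T
    return float(np.trace(gamma_l @ g_r @ gamma_r @ g_a).real)


def current(voltage, h_device, e_fermi=0.0, n_grid=401):
    """Integral of T(E) over the bias window (in eV); multiply by 2e/h for amperes."""
    energies = np.linspace(e_fermi - voltage / 2.0, e_fermi + voltage / 2.0, n_grid)
    t_vals = np.array([transmission(e, h_device) for e in energies])
    # Trapezoidal rule written out explicitly.
    return float(np.sum(0.5 * (t_vals[1:] + t_vals[:-1]) * np.diff(energies)))


def current_in_microamp(integral_ev):
    """Convert the integral of T(E) dE (in eV) to microamperes via the prefactor 2e/h."""
    return 2.0 * E_CHARGE**2 / PLANCK * integral_ev * 1e6


# A perfect six-site chain is reflectionless inside the band.
chain = np.diag(np.ones(5), 1) + np.diag(np.ones(5), -1)
t_half = transmission(0.5, chain)
```

With energies in eV, one unit of the integrated transmission corresponds to about
77.5 µA, which is consistent with currents of the order of 10 µA when the integral is a
fraction of an eV.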

3. Results: 

For the numerical investigation we use the ionization potentials of the nitrogen bases
as their site energies, extracted from ab-initio calculations [41]: $\epsilon_G$ =
8.178, $\epsilon_A$ = 8.631, $\epsilon_C$ = 9.722, and $\epsilon_T$ = 9.464, all in eV.
We then shift the reference point of energy to the average of the ionization potentials
of the nucleobases, which is 8.999 eV; with respect to this new origin the on-site
energies of the bases G, A, C, and T become -0.82 eV, -0.37 eV, 0.72 eV, and 0.47 eV,
respectively. This shift is valid for model calculations, as it does not alter the
results qualitatively.
Similar methods have been employed previously, with the average of the ionization
potentials set as the backbone site energy [42].
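
The shift of the energy origin is simple arithmetic and can be checked directly. The
short Python sketch below uses only the four ionization potentials quoted above; the
variable names are illustrative. The arithmetic mean of the four values evaluates to
8.99875 eV, and the shifted energies round to the on-site values used in the text.

```python
# Ionization potentials of the nucleobases (eV), as quoted in the text.
ionization_potential = {"G": 8.178, "A": 8.631, "C": 9.722, "T": 9.464}

# New origin of energy: the arithmetic mean of the four values.
mean_ip = sum(ionization_potential.values()) / len(ionization_potential)

# On-site energies measured from the new origin, rounded to two decimals.
shifted = {base: round(ip - mean_ip, 2) for base, ip in ionization_potential.items()}
```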

In Fig. 2 we show the LDOS profiles for the four different nitrogen bases. We study
the LDOS response of the bases as part of the Watson-Crick basepairs, not as
individual bases, i.e., we trap the AT and GC pairs inside the nanopore and study the
LDOS profile of the respective bases. The positions of the peaks in the LDOS differ,
lying close to the characteristic site energies of the different nucleotides, and the
peak heights differ as well. These relative differences in the LDOS patterns present a
chance to detect the basepairs using the ARPES technique by trapping them inside the
nanopore. As the LDOS behaviour is mostly dominated by the nitrogen bases, not by the
backbones [35], this also provides a new way of biomolecular detection.
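
Why the LDOS peaks sit close to the site energies can be seen in a stripped-down
example. The Python sketch below evaluates the LDOS as minus the imaginary part of the
diagonal Green's function element divided by pi, for an isolated AT pair only: the
zgnr and the leads are left out, and a finite broadening `eta` = 0.02 eV is assumed as
a stand-in for the level broadening that the coupling to the nanoribbon would produce.
The function name and the energy grid are illustrative.

```python
import numpy as np


def ldos(energy, hamiltonian, site, eta=0.02):
    """Local density of states at `site`: -Im[G_ii(E)] / pi with G = (E - H + i*eta)^-1."""
    n = hamiltonian.shape[0]
    green = np.linalg.inv((energy + 1j * eta) * np.eye(n) - hamiltonian)
    return -green[site, site].imag / np.pi


# Isolated AT pair: shifted site energies (eV) and interstrand hopping v = 0.035 eV.
h_pair = np.array([[-0.37, 0.035],
                   [0.035, 0.47]])

energies = np.linspace(-1.5, 1.5, 3001)
rho_a = np.array([ldos(e, h_pair, 0) for e in energies])
rho_t = np.array([ldos(e, h_pair, 1) for e in energies])

# Peak positions of the two LDOS curves.
peak_a = energies[np.argmax(rho_a)]
peak_t = energies[np.argmax(rho_t)]
```

Because the interstrand hopping is small compared with the difference between the two
site energies, each peak stays within a few meV of the corresponding site energy.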
Figure 2. (Color online). LDOS of the four nucleotides trapped at the nanopore.
There are four distinct peaks of different heights representing different bases close to
their characteristic site-energy.







Figure 3. (Color online). Transmission probability T(E) as a function of energy
for different cases. a) Comparison between a bare nanopore and a GC-nanopore. b)
Comparison between the two Watson-Crick pairs AT and GC. c) Characteristic features
of four different mismatched basepairs trapped inside the nanopore. d) Enlarged view
of plot (c) for clearer visualization.


In Fig. 3 we plot the variation in transmission probability for different cases.
The coupling parameter between the boundary sites of the zgnr-nanopore and the DNA
bases is set to 0.2 eV. The intrastrand hopping parameter between identical bases in
the DNA chain is taken as $t_{ij}$ = 0.35 eV, and between different bases as $t_{ij}$ =
0.17 eV. The interstrand hopping between nucleobases is taken as $v$ = 0.035 eV, one
order of magnitude smaller than the intrastrand hopping. These values are consistent
with previous reports [42, 43, 44, 45, 46]. Fig. 3a compares a bare nanopore with a
nanopore in which a DNA basepair (a GC pair) is trapped. The changes in the
transmission spectra are clearly distinguishable: there are characteristic peaks in the
profile in both the positive and the negative energy range. The curves for both the
bare nanopore and the GC-nanopore are symmetric with respect to zero energy, as the
two-sublattice symmetry of the graphene nanopore is preserved in both cases. This
symmetry was violated in the case of ss-DNA sequencing [37]. Fig. 3b shows the
difference between the characteristic features of the two Watson-Crick pairs AT and GC.
Distinct peaks are present in the transmission profile at and around the characteristic
site energies of the respective nucleobases. In Fig. 3c we show the relative changes in
the transmission profile for four different types of basepair mismatches. Each mismatch
has a distinct response at and around its respective site energies. The variations are
quite similar in the positive and negative energy ranges. The mismatches are clearly
distinguishable at low energy, and the characteristic features die down towards higher
energies, because at higher energy we move away from the characteristic site energies
of the nucleobases. In Fig. 3d we zoom into a small energy window of Fig. 3c for better
visualization. The TC mismatch has a distinct peak around 0.3 eV. The GT and AC
mismatches become clearly distinguishable between 0.4 and 0.45 eV and between 0.6 and
0.65 eV, respectively, whereas AG becomes visibly distinct in the energy range 0.8 to
0.9 eV.




Figure 4. (Color online). Current-voltage response of the active nanopore device for
different cases. a) Comparison of the current responses of a bare nanopore and a
GC-nanopore. b) Difference between the characteristic current amplitudes of the two
Watson-Crick basepairs AT and GC. c) Attributes of four different mismatches (AG, AC,
GT, TC). Insets show selected voltage ranges for better visualization. $E_F$ = 0 eV
represents the Fermi energy.

In Fig. 4a we show the changes in the I-V characteristics between a bare nanopore and
a GC-nanopore. The effect of the basepair inside the nanopore becomes prominent at
considerable bias; the inset shows a specific high-voltage range of the curves where
they are clearly distinguishable. Fig. 4b shows the variation in the current response
between the two Watson-Crick pairs AT and GC. They also become distinguishable in the
high-voltage range between 1.7 and 2.0 V. The AT pair produces a higher current than
the GC pair, which reflects their different electronic structure, as the current
response depends on how the local charge density profile is modified by the insertion
of the DNA bases [26]. Fig. 4c shows the relative differences between the four possible
basepair mismatches. At low bias the differences between them are very faint; they
gradually become distinguishable as we increase the bias. The insets of Fig. 4b and
Fig. 4c show specific voltage windows within which the mutual separation between the
basepairs is larger than elsewhere.





Figure 5. (Color online). The left panel shows the stop-and-go translocation of a
random ATGC ds-DNA chain through the nanopore, while the bias across the device is
fixed at a specific value which gives maximum separation in the current response for
different basepairs. We record the characteristic current output for the bases as they
translocate through the nanopore. The respective basepairs and mismatches are indicated
in the figure with their usual symbols (AT, GC etc.). The right panel shows the same
variation for a ds-DNA chain with no basepair mismatch, for better understanding of the
left panel. Though the current is presented in arbitrary units, as we report a model
calculation, inserting the exact numerical values of the constants involved (such as e
and h) gives a current of the order of 10 µA.


In Fig. 5 we finally show the sequencing application: detecting basepair mismatches
along with the two canonical pairs AT and GC. We take a 30-basepair-long random ATGC
chain, translocate it through the nanopore, and record the characteristic current
signals corresponding to the different basepairs, including the mismatches. During this
translocation the bias is kept at 1.72 V; this voltage gives the maximum possible
relative separation between the characteristic currents of different basepairs (see the
insets of Fig. 4b and Fig. 4c). The separation between the canonical pair GC and the
mismatch TC is maximum, whereas that between AT and AC is minimum. The reason is that G
and T belong to different groups: G is a purine and T is a pyrimidine, and their
electronic structures are also quite different. So when the pairing changes from GC to
TC, the corresponding change in the current response is large. For AT and AC, both T
and C belong to the same pyrimidine group, hence the relative change in the response is
much smaller. These relative changes in the current response represent the differences
in electronic structure. If we define a quantity to measure the sensitivity of this
type of sequencing device, e.g., percentage separation =
$(I_{max} - I_{min})/I_{min}$, the maximum and minimum values of the percentage
separation achieved are 17.30% and 3.23%, which implies that the current signals for
the respective basepairs can be detected reliably. We also plot a separate figure (see
the right panel of Fig. 5) for a normal ds-DNA chain without any mismatches, for better
understanding of the effect of mismatches on the current response of the device.

It is also important to mention that, though we have presented the current in
arbitrary units, inserting numerical values of the constants involved (such as e and h)
shows that the currents are of the order of 10 µA, which is much higher than in
previous reports on ss-DNA sequencing and much greater than the noise level of this
type of device, which is of the order of nA [28]. Very recently a report by Feliciano
et al. [32] on the dynamical effects of the environment on the operation of
graphene-based sequencing devices showed that fluctuations of the nucleotides inside
the nanopore may change the conductance of devices relying on a tunneling mechanism,
though they conclude that these effects would not be very important for devices which
rely on transverse conductance with larger transmission probability. As our proposed
device relies on transverse conductance and produces a greater current output, the
effect of this type of noise will be much smaller. Another study, by Krems et al. [48]
in 2009, dealing with different types of noise which may occur in actual sequencing
experiments, showed that these environmental effects do not strongly influence the
current distributions and working efficiency of these devices. Based on these results
we can say that the overall sensitivity of our device will not be hampered too much,
but there will always be sources of noise under actual experimental conditions due to
environmental fluctuations and the presence of water and counterions, which can affect
the device operation.

It is also important to note that this is one of the early attempts to detect basepair
mismatches by means of nanopore sequencing, and the results given in this work are open
to improvement in different ways. One example is functionalization of the edge atoms of
the nanopore, which can significantly enhance the nucleobase-pore interaction, thus
reducing the structural noise by enhancing the graphene-nucleobase electronic coupling
[49, 50]. Different types of groups can be used for functionalization (e.g., hydroxide
[51], amine or nitrogen [28]) to provide custom-made solutions for overcoming noise in
electrical DNA sequencing techniques. It is also true for devices relying on transverse
conductance that most of the current passes through the edges of the nanoribbon, which
is one of the reasons for the poor sensitivity of this type of device, but this can be
controlled with accurate engineering of the nanopore device dimensions. See the
Appendix section for more details.
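
The percentage separation introduced in this section, (I_max - I_min)/I_min, is easy
to tabulate for every pair of current levels. The Python sketch below is illustrative
only: the function names are assumptions, and the three current values are
hypothetical placeholders in arbitrary units, not results from this work.

```python
from itertools import combinations


def percentage_separation(i_a, i_b):
    """Percentage separation 100 * (I_max - I_min) / I_min between two current levels."""
    i_max, i_min = max(i_a, i_b), min(i_a, i_b)
    return 100.0 * (i_max - i_min) / i_min


def separation_table(currents):
    """Percentage separation for every unordered pair of labelled current levels."""
    return {
        (a, b): percentage_separation(currents[a], currents[b])
        for a, b in combinations(sorted(currents), 2)
    }


# Hypothetical current levels (arb. unit), chosen only to exercise the functions.
example_currents = {"GC": 1.00, "TC": 1.10, "AT": 1.05}
table = separation_table(example_currents)
largest_pair = max(table, key=table.get)
```

Taking the maximum and the minimum over such a table gives the two extreme
separations of the kind quoted in the text.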

4. Conclusion: 

In summary, we present an effective and reliable technique to detect basepair
mismatches in a given DNA sample. We analyze different properties, from the LDOS to the
I-V response, in connection with sequential determination, and find distinguishable
signatures in most cases. Most earlier results on DNA sequencing use ss-DNA, which
neglects the basic problem of basepair mismatch leading to different neuro-degenerative
diseases. As various genetic diseases occur due to mismatched basepairs, i.e., when a
nitrogen base in a DNA double helix pairs with a base which is not its complementary
partner, sequencing of ss-DNA cannot provide this information. On the other hand,
previous attempts to detect basepair mismatches do not provide any decisive results. As
both medical science and genetic research progress with time, the reasons

Detection of basepair mismatches in DNA using graphene based nanopore device

behind different genetic disorders, including neuro-degenerative ones (like
Parkinson's, Alzheimer's etc.), are becoming more and more transparent. With this
progress the need for low-cost and reliable DNA sequencing also increases, which should
also provide the necessary technique for proper medical applications. In these
circumstances we present a reliable tight-binding scheme to detect basepair mismatches
in DNA with much better accuracy than previous studies [52]. At the same time, we
understand that the proposed technique needs further improvement for actual application
in a real environment, and hope it will soon be tested with further modifications.

5. Appendix: 

In this section we provide some additional information on basepair detection in DNA.
In Fig. 6 we plot the variation in the current response of our proposed device for AT
and TA basepairs, both being Watson-Crick pairs. In the previous calculations we
preserved the two-sublattice symmetry of graphene by symmetrically connecting the
nucleotides to the edge atoms; in this configuration it is hard to distinguish AT from
TA. For better detectability we break the two-sublattice symmetry and find distinct
responses. The same has been done to distinguish GC from CG. We checked all our results
with broken sublattice symmetry, but found no significant changes for the results
presented in the earlier sections. The percentage separation between AT and TA (GC and
CG) is relatively small (1.5%), which implies that the proposed device is not as
effective here as it is for basepair mismatches.





Figure 6. (Color online). Current-voltage response of the active nanopore device
for two Watson-Crick pairs in opposite orientations. The left panel compares the
current responses of an AT-nanopore and a TA-nanopore. The right panel shows the same
for a GC-nanopore and a CG-nanopore.

We also check the sensitivity of the device to the nanoribbon width. To investigate
this we double the zgnr width relative to the previous results but keep the pore size
fixed. In Fig. 7 we plot the sequential determination, i.e., the stop-and-go
translocation of a ds-DNA chain containing mismatches through the zgnr-nanopore with
increased width. With increasing width the current output increases, which is expected:
as the width increases, the conductance of the device also increases, and so does the
current. But the sensitivity decreases to some extent. As we keep the pore size fixed,
the fraction of the current passing around the pore decreases, and the signature of the
basepairs dies out with increasing width, because the presence of the basepairs
modifies only this part of the current, which is what the device detects. For the
previous case (Fig. 5) the range of current variation is 0.09 (arb. unit) for different
basepairs, which reduces to 0.06 (arb. unit) when we double the width of the zgnr.

Figure 7. (Color online). Stop-and-go translocation of a random ATGC ds-DNA chain
through the nanopore, while the bias across the device is fixed. The zgnr used in this
case has double the width of that used for Fig. 5. We record the characteristic current
output for the bases as they translocate through the nanopore. The current output is
greater than in Fig. 5, but the variation in the responses for different basepairs,
including mismatches, decreases slightly.

Following the above results (Fig. 7) we can say that several issues compete in the
sequential detection technique. First, to get a higher current output from the device
one has to increase the ribbon width, but this also hampers the device sensitivity to
some extent. In order to maintain the desired accuracy one has to increase the nanopore
dimension along with the ribbon width. Increasing the pore size increases the fraction
of the current passing around the pore, and the effect of the basepairs becomes more
pronounced, because the device detects only the changes in the current passing around
the nanopore due to the presence of the basepair. To reduce the fluctuations of the
basepairs inside the nanopore during translocation, the edge atoms of the nanopore have
to be functionalized with different groups |2Hl EH |32], as discussed in the earlier
section. Thus, in the sequential determination of DNA or other biomolecules there are
several parameters which have to be optimized for accurate and precise measurement.